Coding Code: Investigating Students’ Data Science Skills with Qualitative Methods

Dr. Allison Theobold

Today’s layout


Qualitative research

“Qualitative researchers strive to understand the meaning people have constructed about their world and their experiences.” (Merriam 2002)


“Qualitative research is an effort to understand situations in their uniqueness as part of a particular context and the interactions there. This understanding is an end in itself.” (Patton 1990)

What are the principles of qualitative research?


  • The researcher is the primary instrument for data collection and data analysis

  • The analysis seeks to find emerging themes

  • The product of a qualitative study is richly descriptive

How might this look?


Sample Selection

Select a sample from which the most can be learned!

Data Collection

Major sources of data – interviews, observations, documents

Data Analysis

Compare units of data to find common patterns across the data

Investigating student learning through code

Warm-up (90 seconds)


RPMA2GrowthSub$Weight[RPMA2GrowthSub$Age == 1]


How would you describe the action(s) being taken in this statement?

A framework for analyzing students’ code (Schulte 2008)

| | Text Surface | Program Execution | Function |
| Macrostructure | Understanding the overall structure of the program | Understanding the “algorithm” of the program | Understanding the goal / purpose of the program (in its context) |
| Relations | References between blocks, e.g., method calls, object creation | Sequence of method calls, object sequence diagrams | Understanding how sub-goals are related to goals, how function is achieved by subfunctions |
| Blocks | Regions of interest (ROI) that syntactically or semantically build a unit | Operation of a block, a method, or a ROI (as a sequence of statements) | Function of a block, may be seen as a sub-goal |
| Atoms | Language elements | Operation of a statement | Function of a statement, only understandable in context |

Coding students’ code


RPMA2GrowthSub$Weight[RPMA2GrowthSub$Age == 1]


Descriptive code

“Filters a vector of values using extraction operator, based on an equality relation with a variable selected from dataframe using $ operator”

In-vivo code

“Uses [ ] and == to filter vector, uses $ to select variable”
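To see what these two codes describe, the statement can be unpacked one atom at a time. The sketch below assumes a small made-up data frame standing in for RPMA2GrowthSub, since the real dataset is not shown here; only the column names Weight and Age come from the statement itself, and the intermediate object names are hypothetical.

```r
# Hypothetical stand-in for the student's data; all values are invented.
RPMA2GrowthSub <- data.frame(
  Age    = c(1, 2, 1, 3),
  Weight = c(10.5, 20.1, 11.2, 30.4)
)

# `$` selects a variable (column) from the data frame as a vector.
ages    <- RPMA2GrowthSub$Age
weights <- RPMA2GrowthSub$Weight

# `==` compares each age to 1, producing a logical vector.
is_age_one <- ages == 1

# `[ ]` extracts the weights at the positions where the logical vector is TRUE.
weights_age_one <- weights[is_age_one]

# The step-by-step result matches the student's one-line statement.
same_result <- identical(
  weights_age_one,
  RPMA2GrowthSub$Weight[RPMA2GrowthSub$Age == 1]
)
```

Each intermediate object corresponds to one phrase of the descriptive code: selecting a variable, forming an equality relation, and filtering a vector.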

Uncovering emergent themes

linearAnterior <- lm(PADataNoOutlier$Lipid ~ PADataNoOutlier$PSUA)

early <- subset(RPMA2Growth, StockYear < 2006)  

Weight5 <- mean(RPMA2GrowthSub$Weight[RPMA2GrowthSub$Age == 5], na.rm = TRUE)

gas <- gas[!(substr(gas$sampleID,3,3) %in% c("b","c")), ]   

obsD <- subset(gas, gas$carboy == "D")$N15_N2_Ar

lowerCIBound <- pMat[1:mlleIndex,1][which.min(abs(mlleCI+likelihoods[1:mlleIndex]))]

Data wrangling

Statements of code whose purpose is to prepare a dataset for analysis and / or visualization

Sub-themes

  • selecting variables
  • filtering observations
  • mutating variables
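As a rough illustration of the three sub-themes, the sketch below applies each one to a small invented data frame using base R. The object name fish, its values, and the assumption that Weight is recorded in grams are all hypothetical; only the column names echo the student statements above.

```r
# Hypothetical data frame; values are invented for illustration.
fish <- data.frame(
  Age       = c(1, 5, 5, 2),
  Weight    = c(10, 52, 48, 21),   # assumed to be in grams
  StockYear = c(2004, 2007, 2005, 2008)
)

# Selecting variables: keep only some of the columns.
selected <- fish[, c("Age", "Weight")]

# Filtering observations: keep only the rows that meet a condition.
early <- subset(fish, StockYear < 2006)

# Mutating variables: create a new column from an existing one.
fish$WeightKg <- fish$Weight / 1000
```

A single student statement can fall under more than one sub-theme; for example, a statement that filters rows and then pulls out one column both filters observations and selects a variable.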

An alternative direction


Process coding:

uses gerunds (“-ing” words) to connote action in the data (Saldaña 2013)


  • Particularly relevant to describing the processes of human actions
  • Can be intertwined with time, such that actions can emerge, change, or occur in particular sequences.

Practical considerations

How much code should I collect?

  • Driven by the research question!
    • Amount of each student’s code
    • Number of students

How do readers trust my analysis?

  • Trust comes from:

    • confirmability
    • reliability
    • credibility
    • transferability


Excellent resources: Creswell & Poth (2018); Merriam & Tisdell (2016); Miles et al. (2020)

How could this be used?

Concept dependence

How does a student’s conceptual model of a dataset inform how they filter data?

(atoms; program execution)

Program environment

How do the visualizations produced by students who learn ggplot differ from those who learn “base” R?

(blocks; program execution)

Linguistic structure

How do students name objects they will use later?

(relations; text surface)

Learning trajectory

How do students’ exploratory data analyses change over the duration of a course?

(macrostructure; function / purpose)

Why is this important for data science education?

Theobold et al. (2023)


How can we distinguish merely interesting learning from effective learning (Wiggins and McTighe 2005)?

Questions?

References

Corbin, Juliet, and Anselm Strauss. 2008. Basics of Qualitative Research: Techniques and Procedures for Developing Grounded Theory. Thousand Oaks: Sage.
Creswell, J. W., and C. N. Poth. 2018. Qualitative Inquiry & Research Design. Thousand Oaks, CA: Sage.
Merriam, S. B., and E. J. Tisdell. 2016. Qualitative Research. San Francisco, CA: John Wiley & Sons.
Merriam, Sharan B. 2002. Qualitative Research in Practice: Examples for Discussion and Analysis. 1st ed. New York: John Wiley & Sons.
Miles, M. B., A. M. Huberman, and J. Saldaña. 2020. Qualitative Data Analysis. Thousand Oaks, CA: Sage.
Patton, Michael Q. 1990. Qualitative Evaluation and Research Methods. 2nd ed. Thousand Oaks: Sage.
Saldaña, J. 2013. The Coding Manual for Qualitative Researchers. Thousand Oaks: Sage.
Schulte, Carsten. 2008. “Block Model.” Proceedings of the Fourth International Workshop on Computing Education Research, September. https://doi.org/10.1145/1404520.1404535.
Theobold, Allison S., Megan M. Wickstrom, and Stacey A. Hancock. 2023. “Coding Code: Qualitative Methods for Investigating Data Science Skills.”
Wiggins, G., and J. McTighe. 2005. Understanding by Design. 2nd ed. Alexandria: Association for Supervision and Curriculum Development (ASCD).